Automatic Synthesis Tackles Power Tower
By John Silvey, Jimmy Gumulja, Der-yi Sheu, Jeff Scott and Jonathan Tong
Integrated System Design
April 2, 2002 (12:27 p.m. EST)
We design 32-bit RISC microcontroller-based platforms, in both hard and soft formats, that will power next year's third-generation cell phones. Power is a prime concern in these devices. Though custom design of critical blocks in the microprocessor helps us meet optimum power, performance and area goals, it's simply too labor-intensive for large-scale designs, particularly as we migrate to ultradeep-submicron processes and as gate counts rise. Consequently, we have been gradually switching from a semicustom design flow to a synthesis-based methodology to automate our design process and integrate EDA solutions for low-power management.
When we set out to design the latest version of the RISC processor, we planned on using a completely synthesis-driven flow, but were concerned that power consumption would rise to an unacceptable level when the design was left to the automated synthesis technology. We were pleasantly surprised that the opposite was true.
The core design on the platform was a 220k-gate device running at 90 MHz with a 16k two-way set-associative cache. This core and platform will be going on a chip that will ultimately be fabricated in Motorola's own 0.13-micron process technology.
In the past, we used a latch-based design so we could implement data gating as well as clock gating to reduce power consumption. Data gating allowed us to minimize the switching of adders, multipliers and other arithmetical components. But in the new synthesis-based flow, we relied solely on clock gating to reduce power consumption in the clock-driven, edge-based design.
Clock gating is a common power-reduction technique used in many power-critical designs. It gates the clocks of individual synchronous-load-enable register banks instead of feeding the output back to the input when the load-enable condition is low. If this technique isn't used, the registers continue to be clocked even when enable is off, which wastes power. The technique also replaces banks of feedback muxes, with the load enable controlling the gating logic on the clock pin (see figure). In this way, when enable is off, the clock doesn't pass through, and the register bank is effectively shut down.
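The difference between the two register styles can be illustrated with a toy behavioral model. The Python sketch below is our own illustration, not anything produced by the synthesis tools: the `RegisterBank` class, the two `run_*` functions and the stimulus values are hypothetical names and data, and the count of clock edges reaching the flip-flops is only a rough proxy for switching activity, not a power estimate.

```python
from dataclasses import dataclass


@dataclass
class RegisterBank:
    """Hypothetical model of a bank of edge-triggered flip-flops."""
    width: int
    value: int = 0
    clock_events: int = 0  # clock edges delivered, summed over all flops

    def clock_edge(self, d: int) -> None:
        # Every flop in the bank sees the edge and captures its D input.
        self.value = d & ((1 << self.width) - 1)
        self.clock_events += self.width


def run_feedback_mux(stimulus, width=32):
    """Ungated style: the clock always runs; a mux recirculates the output."""
    bank = RegisterBank(width)
    for enable, data in stimulus:
        d = data if enable else bank.value  # feedback mux
        bank.clock_edge(d)                  # flops are clocked every cycle
    return bank


def run_clock_gated(stimulus, width=32):
    """Gated style: the load enable decides whether the clock gets through."""
    bank = RegisterBank(width)
    for enable, data in stimulus:
        if enable:                          # clock gate passes the edge
            bank.clock_edge(data)
        # otherwise the bank sees no clock edge and simply holds its value
    return bank


# Assumed stimulus: (load_enable, data) pairs, one per clock cycle.
stimulus = [(1, 0xA5), (0, 0xFF), (0, 0x12), (1, 0x3C), (0, 0x00)]

mux = run_feedback_mux(stimulus)
gated = run_clock_gated(stimulus)

print(f"final value: mux={mux.value:#x}, gated={gated.value:#x}")
print(f"flop clock events: mux={mux.clock_events}, gated={gated.clock_events}")
```

Both versions end with the same register contents, but the gated bank is clocked only on the cycles in which the enable is high. A real clock gate is typically built around a latch so that the gated clock stays glitch-free, a detail this cycle-level sketch does not model.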
We used the Synopsys Power Compiler tool, which is part of the Synopsys Design Compiler flow and automates clock gating. Naturally, we were anxious to see if the tool could meet acceptable power-consumption levels. No adjustments to the RTL were required to implement Power Compiler in the flow. We simply synthesized the design, routed it with Avanti's Apollo and were pleased to find that power consumption came in at a respectable 103.5 milliwatts. To understand how much power we had saved, we ran an experiment and synthesized the design without Power Compiler. The comparison showed a power savings of 51.5 percent on the synthesizable portion of the core (excluding memories).
The automated clock-gating technique yielded other benefits. We used fewer design resources: At least two engineers who had been working on manual clock bay design (buffering, managing insertion delay, manually adding clock gating) were freed to work on other things. In addition, the synthesis-based methodology cut the design cycle in half, from about eight months to four. We estimate that the automated clock gating accounted for about 20 percent of the overall time savings we experienced.
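As a rough back-of-the-envelope reading of those schedule figures, the short Python sketch below works out what a 20 percent share of the time savings amounts to. The variable names are ours, and the inputs are the approximate values quoted above, so the result is only an estimate.

```python
# Approximate figures quoted in the text.
old_cycle_months = 8.0     # earlier design cycle
new_cycle_months = 4.0     # synthesis-based design cycle
clock_gating_share = 0.20  # estimated share of the time savings

months_saved = old_cycle_months - new_cycle_months
clock_gating_months = clock_gating_share * months_saved

print(f"total time saved: {months_saved:.1f} months")
print(f"attributed to automated clock gating: {clock_gating_months:.1f} months")
```

On these numbers, automated clock gating accounts for a little under one month of the roughly four months saved.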
We were also relieved that the technique did not adversely affect the back-end process. Many engineers are reluctant to use automated clock gating because they think that it will affect clock skew, test coverage or area. We found that the entire design was readily managed using synthesis-based solutions. For example, with only a few commands in Apollo, we managed clock skew to a satisfactory 300 picoseconds. Automatic test insertion also was unaffected, and we achieved 100 percent testability.
Overall, since Power Compiler works in conjunction with Design Compiler, we were able to reduce power consumption without adversely affecting the timing of the design.
Given these favorable results, we plan to use these synthesis-based, automated techniques on all future projects.
---
John Silvey is platform project leader; Jimmy Gumulja is M341 cache design leader; Der-yi Sheu is back-end methodology leader; Jeff Scott is M310 block leader; and Jonathan Tong is MMU block leader in Motorola's Architectures and Systems Group (Austin, Texas).
http://www.isdmag.com
Copyright © 2002 CMP Media LLC 4/1/02, Issue # 14154, page 16.